[contents] [next] [bottom] (1 out of 3)

Chapter 7 - Collections

Collections

Collections are classes that implement common aggregate data structures such as arrays, linked lists, and hash tables. The Collection class and its subclasses define behavior for collections and provide methods for adding and removing elements in collections, for querying and changing those elements, and for searching for elements in collections.

This section is not an exhaustive description of collection classes in ScriptX. It is intended as an introduction to the most useful classes and the most common ways of using those classes. Figure 7-1 depicts an abbreviated class-inheritance diagram for the Collections component. The sections that follow contain a short summary of many of the available collection classes, and of the most commonly used methods. For more information on collections, see the ScriptX Components Guide and the ScriptX Class Reference.

The ScriptX Collection Classes

Figure 7-1: A few of the collection classes

Array, ArrayList, and LinkedList

The Array class provides the simplest data structure of the Collection classes, containing simply a linear array of elements. LinkedList is similar to Array, except it is implemented as a linked list of elements. Arrays are generally allocated with some number of "slots," (typically 20), and grow by an incremental number of elements when they run out of slots. Linked lists are variable in size; elements are added and removed dynamically as needed. The ArrayList class provides a hybrid of Array and LinkedList; its internal structure is like a linked list of individual arrays. Arrays are typically faster for querying and changing elements, but linked lists are faster for inserting elements.

Array, ArrayList, and LinkedList are implicitly keyed sequences (they inherit from the abstract classes Sequence and ImplicitlyKeyedCollection), which means that their elements are accessed through their position in the list or array and not by some explicit key. Sequences begin with the ordinal position 1 (that is, the first element in a sequence is ordinal position 1). The following code illustrates using ordinal position to access an element of an array:

-- using the element access construct to get the first element
myArray[1] 
-- using the method getOne to access the eighth element
getOne myArray 8

KeyedLinkedList

The class KeyedLinkedList is a list of explicit key-value pairs. Whereas sequences can be said to have implicit keys (their keys are their ordinal positions), instances of KeyedLinkedList have explicit keys. The values associated with those keys are usually accessed via the keys themselves.

SortedArray, SortedKeyedArray

The SortedArray and SortedKeyedArray classes provide implementations of an array and an array of key-value pairs, respectively, that are always sorted. If you add new elements to a sorted collection, they are inserted based on where they should appear in the sorting order.

You have the option of specifying the function that will be used to sort keys or items in the array. For example, if your sorted array contains items that are instances of some class, items can be sorted on the value of a particular instance variable. For a discussion of sorting with these classes, see the "Collections" chapter in the ScriptX Components Guide.


Important - Note that sorted arrays and sorted keyed arrays do not resort themselves if you modify one of their elements. If you modify a sorted array, you should do so by explicitly removing the item from the array, modifying it, and adding it back, so that the array will be sorted properly.

Ranges

Ranges are used to represent a range of numbers between two values. Ranges are divided into two broad categories, those which contain discrete numbers and those which do not. In the first category is the class DiscreteRange, which inherits from Range and Sequence and has subclasses IntegerRange and NumberRange. These range classes containing discrete numbers are subclasses of Collection. Ranges containing numbers which are not discrete belong to ContinuousNumberRange and do not inherit from Collection.

Single, Pair, Triple, and Quad

Single, Pair, Triple, and Quad are bounded sequences, that is, they can only contain a specific number of elements (one, two, three, and four, respectively). These classes take up less memory than a more general sequence (such as an array) with the same number of elements. Also, they provide slightly better performance because they are accessed directly, eliminating one level of indirection. Therefore, it is recommended that you use a Single, Pair, Triple, or Quad whenever it is appropriate.

global args := #(myString, 1, 0, 999) as Quad

Other Classes

ScriptX collections include two other classes which may be of interest:

Array and KeyedLinkedList Literals

Chapter 2, "ScriptX Building Blocks," described the literals in ScriptX for specifying instances of the Array and KeyedLinkedList classes. They are reproduced here for a summary:

#( element, element, element, . . . )
#( key:value, key:value, . . . )

Range Literals

Ranges, such as those used in the for loop examples, are a special literal that creates an instance of the appropriate subclass of Range.

startValue to endValue
startValue
to endValue by increment
In both forms, startValue is the number to begin the range, and endValue is the number to end the range.

1 to 10
1 to 56

To iterate down from one number to another, put the larger number first in the range:

10 to 1 by -2
100 to 50 by -1

The by increment clause allows you to specify the amount by which each number in the range is incremented or decremented. If startValue is lower than endValue, the by increment clause is optional. In this case, there is a default value for increment, which is 1. If startValue is larger than endValue, the increment must be explicitly stated.

1 to 10 by 2
1 to 100 by 7
0 to 1 by 0.1
10 to 1 by -1
88 to 0 by -8

Although the startValue, endValue, and increment are often numbers, they can also be an expression that results in a number. Keep in mind that most complex expressions must be specified inside parentheses for those expressions to be evaluated before the range expression. The following expressions can be used without parentheses, where appropriate:

Continuous Ranges

The range literal generally evaluates to an instance of the IntegerRange or NumberRange class. There is also a literal for ContinuousNumberRange, a range of all the numbers (floating-point and integer) between the end-points of the range. Although you can test the membership of a given number in a ContinuousNumberRange, you cannot count or list its elements.

To create an instance of ContinuousNumberRange, use the following literal syntax:

startNum to endNum continuous 
The continuous reserved word indicates that the range is to be continuous. You also have the option of specifying the exclusivity of the numbers that begin and end the continuous range. There are two reserved words for exclusivity:

exclusive
inclusive
Either the startNum or the endNum, or both, can be specified as exclusive or inclusive, where exclusive means that the range includes all numbers except that number itself, and inclusive means the range includes the number specified. The default is inclusive.

1 to 3 continuous
1 exclusive to 3 inclusive continuous
1 to 3 exclusive continuous

Note that in the last example, the exclusive reserved word only applies to the end number of the range. You must specify exclusivity separately for both ends of the range.

Creating New Instances of Collections

Just as with other objects you can use either the new method or the object expression to create a new instance of a subclass of Collection. You can specify the initial contents of a collection using the contents reserved word:

object myList (Array)
	contents "penny", "nickel", "dime", "quarter"
end
#("penny", "nickel", "dime", "quarter")

Probably the easiest way to create a new instance of a collection with an initial set of values is to use either the array or keyed list literal (as described on page 154), and then coerce the resulting Array or KeyedLinkedList object into an instance of the appropriate class. See the next section "Coercing Between Collection Classes" for examples.

Coercing Between Collection Classes

Collections can be coerced into other collection classes by using the as reserved word with the new collection class.

oldCollection as newCollection
In this syntax, oldCollection is an expression yielding the original collection, and newCollection is the class of the collection to which it is coerced (or an expression resulting in a class).

col := #(4,2,1,3)
getClass col
Array
col as SortedArray
#(1,2,3,4) as SortedArray
col2 := col as KeyedLinkedList
#(4:4,2:2,1:1,3:3)
col2 as LinkedList
#(4,2,1,3) as LinkedList

The return values in the above examples are shown as if they had been printed using the prin method. The printable representations of most of the collection classes always refer to the class of the object as if that object had been coerced, even if it was originally created as an instance of that class. This is so that the printable representation of a collection class is the same as the syntax for creating that collection in a script. See "Printing Collections" on page 164 for more information.

Coercing between subclasses of Collection that are not of similar appearance may have unusual side effects. Table 7-1 describes the effect of many of these coercions.

Table 7-1: Coercions between collection classes

Original Collection

New Collection

Effect of Coercion

Sequences such as Array and LinkedList

Explicitly keyed collections such as KeyedLinkedList

The elements in the original collection become both the key and the value in the key-value pair (that is, the keys and the values in each are the same in the new collection).
#(1,2) as KeyedLinkedList
#(1:1,2:2)

Explicitly keyed collections such as KeyedLinkedList

Sequences such as Array and LinkedList

Only the values in the original key-value pairs are retained. The order of the elements in the new collection may not be the same as in the original collection.
#(1:"one",2:"two") as Pair
#("one","two") as Pair

Unsorted collections such as Array and KeyedLinkedList

Sorted collections such as SortedKeyedArray and SortedArray

The elements in the original collection are sorted.
#(4,3,1,2) as SortedArray

#(1,2,3,4)

Unbounded collections such as Array and KeyedLinkedList

Bounded collections such as Single and Pair

If the unbounded collection contains more elements than the bounded collection can hold, only the initial elements are included in the new collection. All others are discarded. If there are too few elements in the unbounded collection, the remaining elements are set to undefined.
#(1,2,3,4) as Pair
#(1,2) as Pair

Accessing and Changing Elements in Collections

To get access to an element in a collection, use a method such as one of those described in the next section, or use the element access construct, which is actually syntactic shorthand for the getOne method:

collection [ key ]
With this syntax, collection is an instance of a collection class, or an expression that returns a collection. If this collection is an implicitly keyed collection, then key represents the position of an element.

melons := #("honeydew","canteloupe","water","carnegie")
melons[1]
"honeydew"
melons[3]
"water"

If the collection is an explicitly keyed collection, then the key is an explicit key, or an expression that returns a key.

r := #(@one:"money",@two:"show",@three:"get ready",@four:"go!")
r[@two]
"show"

To change elements in a collection, use the same syntax as above, followed by an assignment. This form of element access is syntactic shorthand for the setOne method:

collection [ key ] := newValue
For example:

melons := #("honeydew","canteloupe","water","carnegie")
melons[3] := "cassava"
print melons 
#("honeydew","canteloupe","cassava","carnegie")
melons[5] := "water"
print melons 
#("honeydew","canteloupe","cassava","carnegie","water")

r := #(@one:"money",@two:"show",@three:"get ready",@four:"go!")
r[@four] := "pause"
print r
#(@one:"money", @two:"show", @three:"get ready", @four:"pause")

Summary of the Collection Protocol

This section contains a partial list of generic functions that a Collection object responds to. For a complete listing, see the definition of Collection in the ScriptX Class Reference.

The empty Object

Many of the methods described in the sections below return empty if the element or key you specify is not in the collection. The empty object is a special global constant used by the collection classes to mean "item not found." It should not be confused with undefined. Both empty and undefined are system objects, objects that act as placeholders or constants.

Do not put the empty object into a collection as an element itself. If you assign the empty object to a collection, it becomes difficult to tell whether a method is returning the fact that the specified element was not found, or that it was found, but it was the empty object. Use undefined instead of empty to indicate that an item in a collection has no value.

Functions as Arguments

Some of the generic functions in the collections component take a function as one of their arguments. These generic functions provide a means of iterating through a collection to perform some action. In many cases, the function supplied as an argument is used to select which element or elements from the collection will be returned or operated on by the calling generic function. For instance, removeAll deletes from the collection all elements which cause the function passed to it to return true. A generic function such as chooseAll returns all items which cause the function to return true. In the case of forEach, however, the function is simply applied to each item in the collection, and there is no return value.

These generic functions all take three arguments. The generic function forEach is used to illustrate the form:

forEach collection func arg

collection is the collection to be operated on
func is the function to be applied to the items in the collection
arg is the argument to be passed to func
The function func is iteratively passed two arguments: an item from collection and the arguments contained in arg. If the collection has a natural order (inherits from LinearCollection), the items will be passed to func in that order. The content of the variable arg depends on how many arguments func requires. If func takes no arguments, arg needs to be undefined. If func takes two or more arguments, arg needs to be an array of the arguments.

--calling forEach to print each item to the debug stream
forEach #(2,4,6,8) print debug

The call to forEach in the example above is equivalent to the following code:

print 2 debug; print 4 debug; print 6 debug; print 8 debug

The generic functions chooseAll, chooseOne, chooseOneBackwards, chooseOneBinding, forEach, forEachBackwards, forEachBinding, map, removeAll, and removeOne all take a function as their second argument. See the ScriptX Components Guide and ScriptX Class Reference for more complete information.

Generic Functions for Accessing Elements in Collections

getOne self  key
getAll self key
getNth self ordinal
The getOne method gets the value given by the first instance of key in the collection self; getAll gets all the values associated with the given key. The getNth method, available only on collections that refer to an element's position as its key, gets the value at the ordinal position ordinal, and is the specialized form of getOne for collections with implicit keys. Keep in mind that ScriptX sequences begin at position 1, not position 0, as in some programming languages.

The getOne method is equivalent to the expression self [key].

getOrdOne self  value
The getOrdOne method is available only on collections that refer to an element's position as its key (that is, subclasses of the abstract collection LinearCollection). It returns the position at which the first instance of value is located in the collection self.

getKeyOne self  value
getKeyAll self value
The getKeyOne and getKeyAll methods act like getOrdOne on explicitly keyed collections such as KeyedLinkedList; given the value, they return the first key in the collection self whose value equals value (getKeyOne) or all keys whose values equal value (getKeyAll).

Generic Functions for Adding Elements to Collections

add self  key  value
The add method inserts the key-value pair specified by key and value into the collection self. For sequences such as arrays or lists, key is the position in the list.

addMany self  another
The addMany method appends all the values (or key-value pairs) in the collection another to the collection specified by self. When the collections are both strings, this method can be used to concatenate them.

append self  value
prepend self value
The append method inserts the given value at the end of the collection self; prepend inserts it at the beginning.

Generic Functions for Deleting Elements from Collections

deleteOne self   value
deleteAll self value
deleteNth self ordinal
The deleteOne method deletes the first instance of the given value in the collection self; deleteAll deletes all instances of value from self. With collections that have explicit key-value pairs, both the keys and the values are deleted. The deleteNth method, available only on collections that refer to an element's position as its key, deletes the instance of the element at the ordinal position ordinal.

deleteKeyOne self   key
deleteKeyAll self key
The deleteKeyOne method deletes the first instance of the key-value pair specified by key in the collection self. The deleteKeyAll method deletes all instances of that key and its values.

Generic Functions for Changing Elements in Collections

setOne collection   key   value
setAll collection key value
The setOne generic function changes the binding of the first instance of key to value in the collection collection. If key does not exist, the key-value pair specified by key and value is added to the collection. The setAll method is equivalent to setOne, except that for all instances of key, the associated values are changed to value.

In sequences such as arrays, key is the position of the element in the collection collection, and cannot be greater than the current length of the collection plus one. (If key does not exist in the collection, the key-value pair can be added to the end of the sequence but not to any position beyond that.)

The setOne generic function is equivalent to the expression collection [key]:= value.

Generic Functions for Searching Collections

Collection access methods can be used to select a particular value, key, or key-value binding.

The generic functions that follow are the ones that are most useful for searching collections, and they are also examples of generic functions that take a function as an argument. Such functions have an iterator built in to their implementation. It is possible to create you own iterator and then apply your own operations, but using functions with a built-in iterator is more efficient.

chooseOne self   func   arg
This function calls func once for each item in the collection self, supplying func with the argument arg, until func returns true. It returns the value of the first item in the collection for which func returns true, or empty if no call to this function returns true.

nameArray := #("Hal", "Sid", "Su", "Wanmo", "Sid")
chooseOne nameArray (x -> x = "Sid") undefined
"Sid"
getOrdOne nameArray "Sid" 
2 -- the first occurrence of "Sid" is at ordinal position 2
chooseAll self   func   arg
This function calls func once for each item in the collection self, supplying func with the argument arg. It returns a collection of all the elements in self for which func returns true. This function is probably the most useful one for searching collections.

-- select the numbers over 75
global numArray := #(4, 2.5, 109, 35, 3809, 1.0, 333.333)
chooseAll numArray (x -> x > 75) undefined
#(109, 3809, 333.333)

-- select the numbers over 75 which are integers
chooseAll numArray (x -> x > 75 and \
	(isAKindOf x Integer)) undefined
#(109, 3809)

-- select the numbers over 10000
chooseAll numArray (x -> x > 10000) undefined
#() -- there are no numbers over 10000

Other Useful Generic Functions and Properties

collection.size 
The size instance variable contains the number of elements in the collection. It is a virtual instance variable that is automatically calculated by its getter method. You cannot change it yourself.

isEmpty self
The isEmpty generic function tests whether the collection self is empty. It returns either true or false.

isMember self  value
The isMember generic function tests whether the element specified by value is in the given collection, where value is the value of an explicitly keyed list, or an element in a sequence such as an array.

The following construct is another way to test membership in a collection:

myArray := #(1, 2, 3, 4)
myArray contains 3
true

Printing Collections

Collection objects, just like all objects, have a printable representation that can be printed to a stream using the standard printing functions described in Chapter 3, "Working with Objects" in the section "Output" on page 72. Instances of Array or KeyedLinkedList print in the same syntax as the array and keyed list literals:

print #("this and that") 
#("this and that")
print #("this":"that") 
#("this":"that")

Instances of classes other than Array or KeyedLinkedList use a similar format, with the name of the class after the array or list itself. This representation is also the same as the syntax for coercing between collections.

nums := #(2,3,1) as SortedArray
print nums 
#(1,2,3) as SortedArray

gourmet := #(@capon:"meat",@brie:"cheese") as SortedKeyedArray
print gourmet 
#(@brie:"cheese", @capon:"meat") as SortedKeyedArray

The @normal printed representation of a collection (that is, the one used by the prin generic function) is limited to printing only ten items, with an ellipsis at the end to show that there are more items in the collection. For collections with more than ten items, you may want to print using either the @complete representation, or use a for loop to print all the elements of the collection individually.

series := (1 to 12) as Array
print series
#(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, . . . )
prin series @complete debug
#(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
for i in series do print i
1
2
3
4
. . .

A useful trick for printing collections is to "pipe" them to a printing function. As an intermediate step, you can even pipe them through a sort function or a sorted collection. For more information on the pipe operator, see "Pipes" on page 106. Here is an example of piping:

ScriptXQATeam := #("Bob Cotterill","Rajiv Joshi","Kathleen Peirce",
		"Su Quek","Felicia Santelli","Kim Swix","Maggie Womack",
		"Yosh Kashima","Bill Hogan") 
ScriptXQATeam | SortedArray | print
"Bill Hogan"
"Bob Cotterill"
"Felicia Santelli"
"Kathleen Pierce"
. . .


This document is part of the ScriptX Language Guide, one of the volumes of the ScriptX Technical Reference Series. ScriptX is developed by the ScriptX Engineering Team at Apple Computer, successor to the Kaleida Engineering Team at Kaleida Labs, Inc.

Copyright 1996 Apple Computer, Inc. All Rights Reserved.